Blind Signal Separation and Speech Recognition in the Frequency Domain

Authors

  • Athanasios Koutras
  • Evangelos Dermatas
  • George Kokkinakis
Abstract

This paper shows that a Blind Signal Separation (BSS) method in the frequency domain (FDBSS) significantly improves both the speaker Signal to Interference Ratio (SIR) and the phoneme recognition score of a continuous-speech, speaker-independent acoustic decoder in an office environment with multiple simultaneous speakers. Specifically, the efficiency of the presented FDBSS method is studied on a TITO (Two Input, Two Output) network. In extensive experiments in an artificially created environment using real-room impulse responses, the mean SIR resulting from the Output Decorrelation increased by approximately 8 dB. Furthermore, percentage phoneme recognition improvements of 85% and 116% were measured for the two separated speech signals compared to the mixed signals. It is also shown that the complexity of the FDBSS method is significantly lower than that of time-domain separation: for M-order linear separating filters it is O(M log M), compared to O(M^2) in the time domain.
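As a rough illustration of the two quantities the abstract relies on, the sketch below computes an SIR in dB and checks that FFT-based filtering gives the same output as direct convolution. Direct convolution of a length-M block with an M-tap filter needs on the order of M^2 multiply-adds, while the FFT route needs on the order of M log M. The function names `sir_db` and `fft_filter`, the signal lengths, and the random test signals are assumptions made for this example; this is not the authors' FDBSS algorithm.

```python
import numpy as np

def sir_db(target, interference):
    """Signal to Interference Ratio in dB: 10*log10(P_target / P_interference)."""
    p_target = np.mean(np.square(target))
    p_interf = np.mean(np.square(interference))
    return 10.0 * np.log10(p_target / p_interf)

def fft_filter(x, h):
    """Linear convolution of x with h using zero-padded real FFTs."""
    n = len(x) + len(h) - 1            # length of the full linear convolution
    nfft = 1 << (n - 1).bit_length()   # next power of two >= n
    spectrum = np.fft.rfft(x, nfft) * np.fft.rfft(h, nfft)
    return np.fft.irfft(spectrum, nfft)[:n]

# Hypothetical test data: a noise "signal" and a random 256-tap filter.
rng = np.random.default_rng(0)
x = rng.standard_normal(1024)
h = rng.standard_normal(256)

y_direct = np.convolve(x, h)   # time-domain (direct) convolution
y_fft = fft_filter(x, h)       # frequency-domain equivalent
max_diff = np.max(np.abs(y_direct - y_fft))

# A target with 10 times the power of the interferer has an SIR of 10 dB.
example_sir = sir_db(np.full(8, np.sqrt(10.0)), np.ones(8))
```

Both routes produce the same filtered signal up to floating-point rounding, so the choice between them is one of cost, which grows more slowly with filter length in the frequency domain.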

Similar resources

Doctoral Dissertation Blind Source Separation Based on Multistage Independent Component Analysis

A hands-free speech recognition system and a hands-free telecommunication system are essential for realizing an intuitive, unconstrained, and stress-free human-machine interface. In real acoustic environments, however, speech recognition performance and speech recording performance are significantly degraded because we cannot detect the user’s speech with a high signal-to-noise ratio (SNR) ow...

Speech extraction in a car interior using frequency-domain ICA with rapid filter adaptations

This paper describes two new algorithms for blind source separation (BSS) based on frequency-domain independent component analysis (FDICA). One is FDICA with prefiltering by a speech sub-band passing filter to slow down the learning speed in low signal-to-noise ratio (SNR) sub-bands. The other is FDICA with sub-band selection learning to reduce the number of iterations for those sub-bands. The ...

A Novel Frequency Domain Linearly Constrained Minimum Variance Filter for Speech Enhancement

A reliable speech enhancement method is important for speech applications as a pre-processing step to improve their overall performance. In this paper, we propose a novel frequency domain method for single channel speech enhancement. Conventional frequency domain methods usually neglect the correlation between neighboring time-frequency components of the signals. In the proposed method, we take...

Blind separation of multiple speakers in a multipath environment

We relate information theoretic blind learning methods (infomax) and Bussgang blind equalization methods. The multipath extension of blind source separation methods can be seen in the frequency domain using FIR matrix algebra (matrices of finite impulse response filters). Three forms of Bussgang algorithms are given. The blind serial update method of Cardoso and Laheld is related to the infomax obj...

Improving simultaneous speech recognition in real room environments using overdetermined blind source separation

In this paper we present a novel solution to the Overdetermined Blind Speech Separation (OBSS) problem for improving the speech recognition accuracy of N simultaneous speakers in real room environments using M (M > N) microphones. The proposed OBSS system uses basic NxN Blind Speech Separation networks that process in parallel all different combinations of the available mixture signals in the freque...

Publication date: 2007